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Abstract 

Background: Detecting genetic variation is a critical step in elucidating the molecular mechanisms underlying 
phenotypic diversity. Until recently, such detection has mostly focused on single nucleotide polymorphisms (SNPs) 
because of the ease in screening complete genomes. Another type of variant, copy number variation (CNV), is 
emerging as a significant contributor to phenotypic variation in many species. Here we describe a genome-wide 
CNV study using array comparative genomic hybridization (aCGH) in a wide variety of chicken breeds. 

Results: We identified 3,154 CNVs, grouped into 1,556 CNV regions (CNVRs). Thirty percent of the CNVs were 
detected in at least 2 individuals. The average size of the CNVs detected was 46.3 kb with the largest CNV, located 
on GGAZ, being 4.3 Mb. Approximately 75% of the CNVs are copy number losses relatively to the Red Jungle Fowl 
reference genome. The genome coverage of CNVRs in this study is 60 Mb, which represents almost 5.4% of the 
chicken genome. In particular large gene families such as the keratin gene family and the MHC show extensive CNV. 

Conclusions: A relative large group of the CNVs are line-specific, several of which were previously shown to be related 
to the causative mutation for a number of phenotypic variants. The chance that inter-specific CNVs fall into CNVRs 
detected in chicken is related to the evolutionary distance between the species. Our results provide a valuable resource 
for the study of genetic and phenotypic variation in this phenotypically diverse species. 
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Background 

The chicken was the first livestock species to have its 
genome completely sequenced [1]: a large collection of 
chicken single nucleotide polymorphisms (SNPs) has 
been available for almost a decade [2]. More recently, 
the number of SNPs has been enlarged to over 7 million 
[3]. Although numerous studies studying genetic vari- 
ation have focused on SNPs, there is growing evidence 
for the substantial role of structural polymorphism in 
phenotypic diversity [4]. Structural variation has been 
recognized as an important mediator of gene and gen- 
ome evolution within populations [5]. While the sizes of 
genetic variants range from a single base to whole chro- 
mosomes, historically only the extreme ends of the 
spectrum have been explored. DNA copy number 
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variants (CNVs) lie between these two extremes, ranging 
in size from thousands to millions of bases. 

In human, many CNVs are in linkage disequilibrium 
with nearby genetic markers and thus appear to be an- 
cient [6]. Others are more recent, such as CNVs af- 
fecting olfactory receptor gene diversity [7], or can be 
recurrent [8]. Structural variants include a variety of mo- 
lecular alterations such as duplications, deletions, and 
inversions [9,10]. A comprehensive map that catalogues 
and indexes structural variants - in particular CNVs - 
across the genome is a necessary prelude to understand- 
ing their role in the context of specific phenotypic traits. 
Early reports estimated that at least 2% of the human 
genome is affected by structural variations [11], but 
more recent studies suggest that as much as 3.75% of 
the human genome harbors common CNVs [12]. 

CNV regions (CNVRs) will be an important comple- 
ment to SNP-centric genome-wide association studies, 
since existing SNP discovery and genotyping methodolo- 
gies are biased against inclusion of these more complex 
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genetic variants. Furthermore, many of the CNVRs are 
not very well represented and annotated in the genomic 
sequence due to biases in chromosome assembly. In 
order to estimate what fraction of the genome is affected 
by CNV, global studies have been performed in human, 
chimpanzee, dog, mouse and cattle. In cattle, for ex- 
ample, 177 high confidence CNVRs were reported as 
covering 28.1 Mb, 35 of these CNVRs being apparently 
breed-differential or breed-specific [13]. To determine 
the full extent of variation and its influence on pheno- 
typic variation, the reference genome assembly should 
be near completion and more individual genomes need 
to be sequenced for the species of interest. Analysis of 
CNVs in livestock species is of particular interest, not 
only because of their economic importance, but also due 
to the often- extensive selection pressure applied in 
generating the different lines and varieties. 

Since CNVs potentially affect gene expression [8], 
CNVs may account for a significant proportion of the 
extensive phenotypic variation observed in this species. 
Examples of phenotypes associated with a CNV in the 
chicken include late feathering on chromosome Z 
(GGAZ) [14], pea comb on GGA1 [15], dark brown 
plumage color on GGA1 [16] and dermal hyperpigmen- 
tation on GGA20 [17]. Additional CNVs have been 
detected in the chicken using aCGH [18], but that study 
only examined ten individuals and only identified 96 
CNVs corresponding to approximately 1.3% of the 
chicken genome. Furthermore only 27 of these CNVs 
were observed in more than one individual. 

Here we applied an aCGH analysis to different chicken 
breeds in order to obtain a global CNV map of the 
chicken genome. 

Results and discussion 

CNV in chicken 

aCGH was carried out using the Agilent 244K chicken 
array with a mean probe spacing of 4000 bp. This array 
is based on the chicken assembly WUSTL 2.1 (GalgaB) 
and covers chromosomes 1-28, 32 and the sex chromo- 
somes Z and W. The virtual chromosome "ChrUn" with 
concatenated unmapped contigs was not taken into ac- 
count in the probe design. To access the chicken CNV 
landscape, we selected 64 animals from 6 commercial 
lines (layer and broiler types), 7 experimental lines (layer 
and broiler types), Red Jungle Fowls and Silkies. DNA 
samples were labeled with Cy3 whereas the reference 
DNA sample - derived from UCD001, the Red Jungle 
Fowl animal previously selected to generate the chicken 
reference genome assembly - was labeled with Cy5. 

We defined conservative parameters for CNV detec- 
tion to limit false positive calls (see Methods). Within 
the 15 lines used in this study, 3,154 CNVs with a diffe- 
rent start and/or end location on the chicken genome 



were detected (Additional file 1). Seventy- five percent of 
these CNVs are losses. Being more conservative, i.e. re- 
quiring the CNV to be observed in at least 2 samples, 
944 CNVs (29.9%) were detected with an average size of 
46.1 kb. The real time PCR validation of 12 CNVs ran- 
ging in being present in 1 to 41 samples did give a suc- 
cessful validation of 92%. Only one marker in a potential 
CNV which was detected in only one animal failed valid- 
ation by showing no difference between the reference 
sample. These results indicated that the detected CNVs 
have a high chance of being a real CNVs. Furthermore, 
we could confirm 13 of the 26 high confidence CNVs 
(50%) identified by Wang et al. (2010) and 23 out of the 
70 CNVs (23.9%) detected in only one animal in that 
study. However, we were only able to confirm 21% of 
the 238 CNVs detected in another study (Wang et al. 
2012) (Additional file 1). One reason why fewer CNVs 
were detected in the studies of Wang et al. [18,19] is the 
use of only 10 and 6 animals respectively from three dif- 
ferent breeds in those two studies. Moreover, both stu- 
dies used a different reference animal and the reference 
animal was also from the same breed. Also, none of the 
chicken breeds were in common between our study and 
those of Wang [18,19] which may account for the lack 
of complete validation. 

The distribution of the gain and loss CNVs over the 
genome is shown in Figure 1. The average size of the 
CNVs detected in this study is 46.3 kb, and the largest 
CNV, 4.3 Mb (CNV #3126), was observed on GGAZ. 
CNV distributions within the different chicken lines are 
given in Additional file 1: Table SI. Although limited 
sequence information is available for GGA16 (MHC- 
containing chromosome), the repetitive nature of this 
chromosome [1] was confirmed by detecting copy num- 
ber variations on the entire sequenced part of this 
chromosome. Another striking case was seen on GGA25 
where 16 out of the 64 animals showed CNVs in a 
CNVR covering a substantial part of the genome se- 
quence (from 0 up to 1.85 Mb). Interestingly, GG A25 is 
one of the more GC-rich chromosomes and it contains a 
relatively large number of minisatellites [20]. 

The average number of CNVs per animal was 103, 
ranging from 61 to 209. The highest number of CNVs 
was detected in the commercial White Leghorn line with 
an average of 187.5 CNV per animal. The lowest number 
of CNVs was observed in the Red Jungle Fowl (Aviandiv) 
population with an average of 83.8 CNV per animal, 
which was expected as an inbred Red Jungle Fowl ani- 
mal was used as a reference in this study. The commer- 
cial broilers showed an average of 128.8 CNVs per 
animal. These numbers are considerably higher than 
those reported by Wang et al. (2012), where the average 
was 40 CNVs per animal. Even when our analysis is re- 
stricted to the autosomes, as was done in the study of 
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Figure 1 CNV distribution across the chicken genome. Red bars indicate copy-number losses and blue bars copy-number g, 
indicates the number of occurrence for a given CNV, divided into four groups: one contains > 10 animals carrying this CNV; the 
and <10 animals; the third is > 2 and <5 and the last group has only one animal carrying this CNV. 
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Wang et al (2012), we still observe many more CNVs 
(118.8) per individual. A cluster analysis of the samples 
based on the CNVs detected in each of the animals 
results in tight clustering of all individuals from the 
same line (Figure 2). 

CNVRs were determined by aggregating overlapping 
CNVs identified in all samples across the aCGH experi- 
ments according to the criteria defined by Redon et al. 



(2006). Aggregating CNVs into CNVRs resulted in a 
total of 1,556 non-overlapping regions covering 60 Mb, 
which represent almost 5.4% of the chicken genome. An 
example of a CNVR in the chicken is given in Figure 3. 
The largest CVNR detected is located on GGAZ (CNVR 
1482) and is 4.37 Mb in size. The number of CNVRs in 
the chicken is considerably higher than that reported by 
Wang et al. (2010) and Wang et al. (2012), 97 and 130 
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Figure 2 Heatmap representation of copy number variation between chickens. Unsupervised clustering of CNVs with gains (green) and 
losses (red) yields a dendrogram that recapitulates features of the known genealogy of these animals within a line or breed. 



respectively. The number of CNVRs in the chicken is 
comparable to that reported in human [21] and almost 3 
times higher than in cattle [22]. 

The 176- kb CNV linked to the late feathering locus 
[14] was detected in this dataset as CNVR 1508 on 
GGAZ between positions 9,971,185 and 10,140,048 with 
a size of 169 kb. The segregation of this CNV is shown 
in Figure 4. As expected, we were unable to identify the 
CNV in intron 1 of the SOX5 gene responsible for the 
pea-comb phenotype in chicken due to the small size of 
this CNV (3.2 kb), which is below the probe spacing on 
our array. 

Line-specific CNVRs 

Some CNVs were observed only in a single animal 
whereas others seem to be fixed in all individuals of one 
specific line. CNVs that are specific for a line or group 
of lines are of particular interest because these are 
potential candidates for genes that affect a phenotype 
specific for that (group of) lines. We therefore identified 
those CNVs that were either fixed in at least one line 
(defined as fixed) or that were fixed in only a single line 
or breed (defined as line-specific). 

Within the 15 different lines used in this study, we 
identified 518 CNVs, comprising a total of 214 CNVRs, 
which were line-specific (Additional file 2). The number 
of CNVRs fixed within a line ranged from 2 in the 



Broiler Mapping population to 68 in the commercial 
White Layer. The number of line-specific CNVRs varied 
from 0 for the Broiler Mapping populations to 30 for the 
commercial White Layer line. The commercial White 
Leghorn line was only represented by two individuals, 
therefore resulting in a higher number of fixed CNVRs 
(68), of which 30 were line-specific. Of the fixed CNVRs, 
the number of line-specific CNVRs was high in the 
experimental lines and the Silkie breed (10 each). 

The Silkie breed has a number of striking phenotypic 
characteristics such as black skin, white feathers and 
black bones. We therefore investigated whether there is 
a relation between some of these CNVRs and some of 
these specific phenotypes. For the Silkies, 27 fixed 
CNVRs were detected of which 10 were breed-specific 
(Additional file 2). 

Two significant CNVRs are located on GGA20 (CNVR 
812 at positions 10,722,231 to 10,844,289 and CNVR 
814 at positions 11,263,937 to 11,435,137) and have 
already been described in detail by Dorshorst et al. 
(2011) after fine mapping of the phenotype fibromela- 
nosis (FM) in Silkies. The candidate gene involved in 
pigmentation, the Endothelin 3 gene (NDN3), is located 
within CNVR 812 [17] and, when up-regulated, is the 
primary driver of dermal hyperpigmentation in FM 
chickens. The potential function of the other eight line- 
specific CNVRs in Silkies is not clear. One of the three 
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Figure 3 Overview of CNVR 209 at GGA1: 144,185,960- 144,403,060. CNVR209 (217.1 kb) consists of 30 individual CNVs (see Additional file 2: 
Table S2) ranging from 8.8kb to 153.7kb in size across all samples. The frequency of the CNVs within this CNVR varied from 1 to 10 across the 
study population. Line thickness represents the number of occurrences of the CNV. Gains are indicated with black lines, losses with blue lines. 



Silkie line-specific CNVR on GGA27 (CNVR 889 at po- 
sitions 4,128,916 to 4,155193) harbours the gene CCR7, 
which stimulates melanoma migration, and the gene 
SMARCA4, a SWI/SNF-related matrix-associated actin- 
dependent regulator of chromatin. Both these genes are 
potential candidates for traits related to pigmentation. 
Further studies are needed to study the full potential of 
these line-specific CNVRs. 

For lines 6 2 and 6 3 , we detected 29 and 36 CNVRs 
respectively, which are fixed in these lines, while only a 
single CNV and 3 CNVRs respectively are line-specific. 
One of the line-fixed CNVRs for line 6 (CNVR 209 on 
GGA1 between positions 144,249,310 and 144, 403,060) 
contains the gene TNFSF13B (tumor necrosis factor 
(ligand) superfamily, member 13b) or BAFF (B cell 



activation factor). This candidate gene stimulates B cells 
to undergo proliferation and to counter apoptosis and 
was examined in more detail. To confirm this CNVR, 
we quantified the relative abundance of the DNA copy 
number for TNFSF13B using a TaqMan assay. Quantifi- 
cation of the ovotranferrin gene, known to be in single 
copy in the chicken genome, was used as an internal ref- 
erence. A primer probe set spanning exon 5 and intron 
6 of TNFSF13B revealed a significant difference in copy 
number between line 6 X when compared to lines 7 and 
N. No difference was detected using a primer probe set 
spanning intron 1 and exon 1. These results suggest that 
there is partial duplication of TNFR13B, with exon 5 
duplicated in all lines tested except in lines 6 2 and 6 3 . 
However, the CNV does not extend as far as exon 1 of 
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Figure 4 Segregation of CNVR1508 at GGAZ: 9,971,185- 
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female k+/- (1 copy) giving a female offspring K/- (2 copies). 



TNFSF13B, as indicated by the equivalent copy number 
across all lines at this region of the gene. Further charac- 
terisation of this CNV will be required to identify its 
boundaries accurately. 

One of the three other line-specific CNVRs (CNVR521), 
detected in lines 6 2 and 6 3) and located at GGA18: 
372390-400489, overlaps with the Myosin Heavy Chain 
gene 1 (MYH1). 



Comparative CNV analysis 

We compared the chicken CNVRs to those previously 
detected in turkey, duck and zebra finch (Additional file 1) 
[23-25]. From the 16 inter-specific CNVs detected 
between turkey and chicken detected by Griffin et al. 
(2008) using comparative CGH, 10 did not show variation 
in the chicken. Within the current study, 15 of these 16 
inter-specific CNVs could be verified, whereas the inter- 
specific CNV on chrE64 could not be validated because 
chrE64 was not used for probe design on the array. From 
the 15 inter-specific regions detected, all (100%) fall into 
CNVRs detected in this study. The CNV detected in the 
chicken Layer vs. Red Jungle Fowl on GGA2 (position 
25,725,000 to 25,785,000) by Griffin et al. (2008) could not 
be confirmed in our study. When comparing the inter- 
specific zebra finch / chicken CNVs [25], 9 of the 27 CNVs 
(33%) did overlap with a chicken CNVR while of the 
inter-specific CNVs detected between duck and chicken 
[24], 15 of the 31 (48%) did overlap with a chicken CNVR. 
These results indicate that the inter- specific CNVs 
detected are prone to overlap with a CNVR of the avian 
lineage when more samples are analyzed. Fewer CNVRs 



would be overlapping with inter-specific CNVs between 
more distant species. 

Gene content of chicken CNVR 

Within the 1,556 CNVRs, a total of 2,642 unique 
Ensembl peptides were identified based on chicken 
build 2.1. To examine whether genes involved in spe- 
cific pathways or biological processes are more prone 
to copy number variation, we performed a gene en- 
richment analysis for the genes located within the 
CNVRs. The chicken transcript ids were used as input 
into DAVID for a gene enrichment and ontology ana- 
lysis [26]. Terms showing significant enrichments were 
the GO terms "functional constituent of cytoskeleton", 
"nuclear binding", "cellular response to stress", and 
"macromolecule catabolic processes". The GO term 
"functional constituent of cytoskeleton" is mainly 
driven by the keratin superfamily. The avian keratin 
genes are over-represented when compared to mam- 
mals [1]. Phylogenetic analysis demonstrated that evo- 
lution of archosaurian epidermal appendages in the 
linage leading to birds was accompanied by duplication 
and divergence of the ancestral E-keratin gene cluster. 
In the chicken, four subfamilies (claw, feather, feather- 
like and scale) of the E-keratin genes have been named 
in accordance with tissue-specific expression and se- 
quence heterogeneity [27]. These E-keratin gene sub- 
families are clustered on GGA25 whereas the genes 
for two other monophyletic groups of feather keratins 
are located on GGA27 and GGA2 respectively. We ob- 
served large CNVRs (CNVRs 863, 873 and 791) within 
all three regions in the chicken genome up to 2 Mb in 
size. Within these CNVRs we observed both CNV 
losses and gains. 

Conclusions 

In this study we performed aCGH screening of the 
chicken genome to identify CNVs in a comprehensive 
manner. We have identified a large number of genes af- 
fected by CNV, including genes involved in well-known 
phenotypes such as late feathering and pigmentation in 
Silkies. In particular large gene families such as the kera- 
tin gene family and the MHC show extensive variation 
in copy number. The CNVs in the chicken overlapping 
with the inter-specific CNVs (CNVs detected between 
different bird species) are potentially old CNVs. More- 
over, when the evolutionary distance between chicken 
and the other bird species is enlarged the older (more 
ancient) the CNV is. Many of these CNVs very likely 
affect traits of economic importance in the chicken and 
our global characterization of CNVRs in the chicken 
genome will aid in the identification of structural vari- 
ation in the genome underlying important phenotype 
differences for qualitative and quantitative traits. 
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Methods 

Construction of the oligonucleotide microarray 

A CGH array for whole genome analysis in chicken 
(UCSC galGaB (WUSTL build 2.1, may 20006)) was 
designed and constructed by Agilent Technologies 
(http://www.genomics.agilent.com). The chicken genome 
CGH microarray kit 244 A had a median probe spacing 
of 4 kb, with probes printed using the Agilent 60-mer 
Sure print technology. 

Experimental chicken lines 

The experimental lines 6 (line 6 2 and 6 3 ), 7 2 and 15I 5 are 
all experimental White Leghorn lines characterized for 
resistance to viral-induced tumours. Line 6 was been 
selected for resistance to Mareks disease (MD) and 
lymphoid leukosis (LL) whereas line 7 2 is susceptible to 
MD and 15I 5 is susceptible to MD and LL [28]. Lines 6 3 , 
7 2 , and 15I 5 are kept at the Avian Disease and Oncology 
Laboratory (ADOL) at East Lansing, MI, USA, while lines 
6i and 7 2 are bred at the Pirbright Institute, Compton, 
UK. Line N is a control line. Line BrL is a Brown Leghorn 
line selected for resistance to infectious bursal disease 
virus (IBDV) [29-31]. All lines have been maintained by 
random mating within the flocks. 

Sample processing 

Breeds in this study include Red Jungle Fowl, one com- 
mercial white and two commercial brown layer lines, six 
experimental lines (five white and one brown) three 
commercial broiler and one experimental broiler line, 
and the Silkie breed. In total 64 animals were used 
(Table 1). Genomic DNA was isolated from blood with 

Table 1 Sample information 



the Puregene blood kit or the Qiagen Qiamap DNA 
blood kit. Blood samples were collected by veterinarians 
according to national legislation. No approval from the 
ethics committee was necessary according to local le- 
gislation. For the experimental lines from the Institute 
for Animal Health at Compton, genomic DNA was 
extracted from whole blood as previously described [32]. 
We assessed the DNA quality and quantity by OD 2 6o/280 
and OD 260/230 readings and on 1% agarose gels. The re- 
ference sample (UCD001) used is the same individual 
that was used to generate the chicken genome reference 
sequence [1]. 



aCGH data analysis and CNV calling 

Unamplified genomic DNA (1 ug) was labeled with Cy3 
(test samples) or Cy5 (reference sample). The Agilent 
Oligonucleotide Array-based CGH for genomic DNA 
Analysis protocol (v4.0:2006) was used for the labeling 
of the DNA, Hybridizations, washings, and scanning 
of the arrays. Self-self control hybridizations were 
performed by labeling the reference sample with Cy3 
and Cy5. Fluorescence intensities ratios were extracted 
using Agilent Feature Extraction software (Agilent Tech- 
nologies). Log2ratio profiles were then normalized using 
aCGH-Spline to remove dye biases and reduce experi- 
mental noise [33]. CNV detection was performed using 
a modified version of CNVfinder [32], where we opti- 
mized empirically significance thresholds using replicate 
self-self hybridizations. The CNVRs were obtained by 
merging overlapping CNVs according to similar criteria 
as described previously [21,34]. 



Breed 


Line 


Breed code 


Platform 


# Samples 


Relation 


Red Jungle Fowl 


Red Jungle Fowl 


aviandiv_101 


Aglilent 244K 


4 


unrelated (FO) 


Exp.Layer brown 


Compton_BrL 


LineBrL 


Aglilent 244K 


5 


unrelated (FO) 


Exp. Layer white 


Compton_6_sub1 


Line6T 


Aglilent 244K 


5 


unrelated (FO) 


Exp. Layer white 


Eastlansing_6_sub3 


Line6 3 


Aglilent 244K 


5 


unrelated (FO) 


Exp. Layer white 


Compton-N 


LineN 


Aglilent 244K 


5 


unrelated (FO) 


Exp. Layer white 


Compton_15l 


Line15l 5 


Aglilent 244K 


5 


unrelated (FO) 


Exp. Layer white 


Compton_7_sub2 


Line7 2 


Aglilent 244K 


5 


unrelated (FO) 


Com. Broiler 


commercial broiler_A 


ComBroilA 


Aglilent 244K 


2 


unrelated (FO) 


Com. Broiler 


commercial broiler_B 


ComBroilB 


Aglilent 244K 


3 


unrelated (FO) 


Com. Broiler 


commercial broiler _C 


ComBroilC 


Aglilent 244K 


4 


unrelated (FO) 


Exp. Broiler 


broiler_M 


BroilM 


Aglilent 244K 


8 


unrelated (FO) 


Com. Layer brown 


commercial layer brown_1 


ComBrownLI 


Aglilent 244K 


4 


unrelated (FO) 


Com. Layer brown 


commercial layer brown_2 


ComBrownL2 


Aglilent 244K 


4 


unrelated (FO) 


Com. Layer white 


commercial layer white 


ComWhiteL 


Aglilent 244K 


2 


unrelated (FO) 


Silkie 


Silkie 


Silkie 


Aglilent 244K 


3 


unrelated (FO) 


total 








64 
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Quantitative RT-PCR 

Relative abundance of DNA copy number for candidate 
CNVs was quantified by TaqMan quantitative PCR using 
an adapted method previously described [35]. 

Primers and probes for TNFSF13B and ovotranferrin 
were designed using Primer Express (Applied Biosystems) 
(Additional file 3). For the PCR we used lx real-time PCR 
mix, TNFSF13B forward primer (0.4 uM), and TNFSR13B 
reverse primer (0.4 uM), ovo forward primer (0.4 uM), 
and ovo reverse primer (0.4 uM), TNFSF13B FAM probe 
(0.2 uM), ovo VIC probe (0.2 uM), bovine serum albumin 
(10 ug per reaction). The PCR was performed using the 
TaqMan fast universal PCR master mix reagents (Applied 
Biosystems, Warrington, UK). Amplification and detection 
of specific products was performed using the Applied 
Biosystems 7500 Fast Real-Time PCR System with the fol- 
lowing thermocycling parameters: 50°C for 2 min, 95°C 
for 10 min, followed by 40 cycles of 94°C (15 sec) and 
60°C (1 min). Results are expressed in terms of the 
threshold cycle value (C t ), the cycle at which the 
change in the reporter dye passes a significance thresh- 
old (AR n ). 

To account for variation in sampling and DNA prepar- 
ation, the C t values for TNFSF13B-specific product for 
each sample were normalized using the C t value of the 
ovotransferrin product for the same sample. Normalized 
C t values were calculated using the formula C t + (Nt - 
Ct) x S/S\ where Nt is the mean C t for ovotransferrin 
among all samples, C t ' is the mean C t for ovotransferrin in 
the sample and S and S 1 are the slopes of the regressions 
of the standard plots for the test TNFSF13B and 
ovotransferrin, respectively. This effectively achieves inter- 
polations on the standard plots to obtain the TNFSF13B 
C t values that would have been obtained had all samples 
had the same (mean) amount of ovotransferrin DNA. 

Additional validation was performed using a quantita- 
tive PCR approach, as described by Weksberg et al. [36], 
to investigate the difference CNVs. Copy number was 
determined for 12 markers in 12 different CNVs. Pri- 
mer3 webtool http://frodo.wi.mit.edu/primer3/ was used 
to design primers for qPCR validation. Amplicon length 
was limited between (50 bp - 100 bp) and regions with 
GC percentage between 30% and 60% were included, 
while avoiding runs of identical nucleotides. All other 
settings were left at their default. Details of the qPCR 
primers can be found in Additional file 4: Table S5. 
qPCR experiments were conducted using MESA Blue 
qPCR MasterMix Plus for SYBR Assay Low ROX from 
Eurogentec, this 2x reaction buffer was used in a total 
reaction volume of 12.5 ul. All reactions were amplified 
on 7500 Real Time PCR system (Applied Biosystems 
group). The copy number differences were determined 
by using a standard ACt method that compares the 
mean Ct value of the target CNV fragments, determined 



from different input concentrations, compared to the 
mean Ct value of reference sample (UCD001). 

Functional gene annotation 

Functional gene annotation is performed in DAVID 
(gene Functional Classification Tool, DAVID Bioinfor- 
matics Resources 6.7, NIAID/NIH at http://david.abcc. 
ncifcrf.gov [26]. Ensemble gene ids within CNVR were 
collected (Additional file 5) and used as the input file for 
DAVID. The EASE score and the modified Fisher Exact 
P -Value were given where the smaller, the more 
enriched. Cut off P- Value within this study was E10" 4 . 

Additional files 
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in other avian species. 

Additional file 2: Presents the specific CNVs either fixed or variable 
within a line or over lines. Yellow blocks indicate fixed CNV within a 
line, green blocks represent fixed CNVs in a certain group of lines 
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